AMF: Aggregated Mondrian Forests for Online Learning
Authors
Abstract
Random forest (RF) is one of the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such tree-ensemble methods comes from a combination of several characteristics: remarkable accuracy in a variety of tasks, a small number of parameters to tune, robustness with respect to feature scaling, a reasonable computational cost for training and prediction, and their suitability in high-dimensional settings. The most commonly used RF variants, however, are ‘offline’ algorithms, which require the availability of the whole dataset at once. In this paper, we introduce AMF, an online random forest algorithm based on Mondrian Forests. Using a variant of the context tree weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this makes it possible to obtain a truly online, parameter-free algorithm that is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function. Numerical experiments show that AMF is competitive with several strong baselines on a large number of datasets for multi-class classification.
Similar resources
Mondrian Forests: Efficient Online Random Forests
Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are computationally efficient to train and test, making them excellent candidates for real-world prediction tasks. The most popular random forest variants (such as ...
Universal consistency and minimax rates for online Mondrian Forests
We establish the consistency of an algorithm of Mondrian Forest [LRT14, LRT16], a randomized classification algorithm that can be implemented online. First, we amend the original Mondrian Forest algorithm proposed in [LRT14], that considers a fixed lifetime parameter. Indeed, the fact that this parameter is fixed actually hinders statistical consistency of the original procedure. Our modified M...
Mondrian Forests for Large-Scale Regression when Uncertainty Matters
Many real-world regression problems demand a measure of the uncertainty associated with each prediction. Standard decision forests deliver efficient state-of-the-art predictive performance, but high-quality uncertainty estimates are lacking. Gaussian processes (GPs) deliver uncertainty estimates, but scaling GPs to large-scale data sets comes at the cost of approximating the uncertainty estimat...
Aggregated Recommendation through Random Forests
Aggregated recommendation refers to the process of suggesting one kind of items to a group of users. Compared to user-oriented or item-oriented approaches, it is more general and, therefore, more appropriate for cold-start recommendation. In this paper, we propose a random forest approach to create aggregated recommender systems. The approach is used to predict the rating of a group of users to...
The Mondrian Process for Machine Learning
This report is concerned with the Mondrian process [1] and its applications in machine learning. The Mondrian process is a guillotine-partition-valued stochastic process that possesses an elegant self-consistency property. The first part of the report uses simple concepts from applied probability to define the Mondrian process and explore its properties. The Mondrian process has been used as th...
Journal
Journal title: Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Year: 2021
ISSN: 1467-9868, 1369-7412
DOI: https://doi.org/10.1111/rssb.12425